SEQUENCE LISTING

(1) GENERAL INFORMATION:

    (i) APPLICANT: Donson, Jon
                   Dawson, William O.
                   Grantham, George L.
                   Turpen, Thomas H.
                   Turpen, Ann Myers
                   Garger, Stephen J.
                   Grill, Laurence K.

    (ii) TITLE OF INVENTION: RECOMBINANT PLANT VIRAL NUCLEIC ACIDS

    (iii) NUMBER OF SEQUENCES: 11

    (iv) CORRESPONDENCE ADDRESS:
        (A) ADDRESSEE: Limbach & Limbach
        (B) STREET: 2001 Ferry Building
        (C) CITY: San Francisco
        (D) STATE: CAL
        (F) ZIP: 94111

    (v) COMPUTER READABLE FORM:
        (A) MEDIUM TYPE: Floppy disk
        (B) COMPUTER: IBM PC compatible
        (C) OPERATING SYSTEM: PC-DOS/MS-DOS
        (D) SOFTWARE: PatentIn Release #1.0, Version #1.25

    (vi) CURRENT APPLICATION DATA:
        (A) APPLICATION NUMBER:
        (B) FILING DATE:
        (C) CLASSIFICATION:

    (vii) PRIOR APPLICATION DATA:
        (A) APPLICATION NUMBER: US 600,244
        (B) FILING DATE: 22-OCT-1990

    (vii) PRIOR APPLICATION DATA:
        (A) APPLICATION NUMBER: US 641,617
        (B) FILING DATE: 16-JAN-1991

    (vii) PRIOR APPLICATION DATA:
        (A) APPLICATION NUMBER: US 310,881
        (B) FILING DATE: 17-FEB-1989

    (vii) PRIOR APPLICATION DATA:
        (A) APPLICATION NUMBER: US 160,766
        (B) FILING DATE: 26-FEB-1988

    (vii) PRIOR APPLICATION DATA:
        (A) APPLICATION NUMBER: US 160,771
        (B) FILING DATE: 26-FEB-1988



    (vii) PRIOR APPLICATION DATA:
        (A) APPLICATION NUMBER: US 347,637
        (B) FILING DATE: 05-MAY-1989

    (vii) PRIOR APPLICATION DATA:
        (A) APPLICATION NUMBER: US 363,138
        (B) FILING DATE: 08-JUN-1989

    (vii) PRIOR APPLICATION DATA:
        (A) APPLICATION NUMBER: US 219,279
        (B) FILING DATE: 15-JUL-1988

    (viii) ATTORNEY/AGENT INFORMATION:
        (A) NAME: Halluin, Albert P.
        (B) REGISTRATION NUMBER: 28,957
        (C) REFERENCE/DOCKET NUMBER: BIOG-20121 USA

    (ix) TELECOMMUNICATION INFORMATION:
        (A) TELEPHONE: 415-433-4150
        (B) TELEFAX: 415-433-8716


(2) INFORMATION FOR SEQ ID NO: 1:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 4 amino acids
        (B) TYPE: amino acid
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: peptide

    (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE: NO

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Pro Xaa Gly Pro
1

(2) INFORMATION FOR SEQ ID NO: 2:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 13 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

    (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE: NO

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

GGGTACCTGG GCC 13



(2) INFORMATION FOR SEQ ID NO: 3:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 886 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

    (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE: NO

    (vi) ORIGINAL SOURCE:
        (A) ORGANISM: Chinese cucumber

    (vii) IMMEDIATE SOURCE:
        (B) CLONE: alpha-trichosanthin

    (ix) FEATURE:
        (A) NAME/KEY: CDS
        (B) LOCATION: 8..877

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CTCGAGG ATG ATC AGA T/tC TTA GTC CTC TOT TTG CTA ATT 

Met lie Arg yPhe Leu Val Leu Ser Leu Leu lie 
1/5 10 




CTC ACC cTcas 
Leu Thr Leu 



TTC CTA ACA ACT CCf GCT GTG GAG GGC GAT GTT AGC TTC CGT TTA TCffl? 



Phe Leu Thr Thr Pfo Ala Val Glu Gly Asp Val Ser Phe 
15 / 20 25 



Arg Leu Ser 
30 



GGT GCA ACA AGC /VGT TCC TAT GGA GTT TTC ATT TCA AAT CTG AGA ABISftS 
Gly Ala 



Thr Ser/ Ser Ser Tyr Gly Val Phe He Ser Asn 
35 40 



GCT CTT CCA AAJT GAA AGG AAA CTG TAC GAT ATC CCT CTG 
Ala Leu Pro A^n Glu Arg Lys Leu Tyr Asp He Pro Leu 



Leu Arg Lys 
45 

TTA CGT TOCSS 
Leu Arg Ser 






TCT CTT 
Ser Leu 



50 

CCA GGT 

Pro Gly 
65 



GCC GAT GAA AGO 

Ala Asp Glu Thr 
80 



ATG GGA TAT CGC GCT GGC 
Tyr Arg 



Met Gly 
95 

GCA ACA 

Ala Thr 



TCT CAA CGC TAC 

Ser Gin Arg Tyr 
70 

ATT TCA GTG GCC 

lie Ser Val Ala 
85 

GAT ACA 

Asp Thr 



Ala Gly 
100 



Sei/ Tyr Phe 
105 



Phe Lys Asp 
120 



ACG CTT CCA TAT 

Thr Leu Pro Tyr 
130 

AAA ATA AGG GAA 

Lys lie Arg Glu 
145 



Glu Arg Leu 
135 



ATT ACC ACT TTG 

lie Thr Thr Leu 
160 

ATG GTA CTC ATT 

Met Val Leu lie 
175 

GAG CAA 
Glu Gin 

GCA ATT 
Ala lie 




60 

CAT CTC ACA AAT Tfflai 

His Leu Thr Asn Tyr 
75 

ACG AAC GTC TAT ATZB9 

Thr Asn Val Tyr lie 
90 



TTC AAC 
Phe Asn 



GAG GCT TCHI37 
Glu Ala 



GAA GCT GCA AAA 

Glu Ala Ala Lys 
115 

TCT GGC 
Ser Gly 

AAT ATT 
Asn lie 



C AAA GAG GCT ATG CGA AAA 
Ala Met 



Arg Lys 
125 



GAA AGG CTT CAA ACT GCT GCG 
Gin Thr 



'Pro Leu 
150 



GGA CTC CCA GCT TTG 

Gly Leu Pro Ala Leu 
155 



Ala Ala 
140 

GAC AGT 
Asp Ser 



Ser 
110 

GISBS 

Val 

00X33 
Gly 

GGX81 
Ala 



TAC AAC GCC AAT TCT GCT GCG 
Ala Asn Ser 



TTG GAA 
Leu Glu 



Tyr Asn 
165 



ACG TCT GAG GCT GCG 

Thr Ser Glu Ala Ala 
185 

CGC GTT GAC AAA ACC 

Arg Val Asp Lys Thr 
200 

AAT AGT TGG TCT GCT 

Asn Ser Trp Ser Ala 
215 



Ala Ala 
170 



TCG GCA CHE 9 
Ser Ala Leu 



AGG TAT AAA TTT A'EI77 

Arg Tyr Lys Phe lie 
190 

TTC CTA CCA AGT T'BS5 

Phe Leu Pro Ser Leu 
205 

CTC TCC AAG CAA A'BI73 

Leu Ser Lys Gin lie 
220 



11 



CAG ATA GCG AGT ACT AAT AAT GGA CAG
Ser Thr Asn Asn 



Gin lie Ala 
225 



Gly Gin 
230 



TTT GAA ACT/CCT GTT GTG CTTEl 

Phe Glu TY/c Pro Val Val Leu 
235 



ATA AAT GCT CAA AAC CAA CGA GTC ATG ATA ACC 

Val Met 



lie Asn Ala 
240 



Gin Asn Gin Arg 
245 



lie Th]f Asn 
250 



GTT GAT GCT GGB69 
Val Asp Ala Gly 



GTT GTA ACC TCC AAC ATC GCG TTG CTG CTG IfAT CGA AAC AAT ATG G(Sa.7 

Leu Leu 



Val Val Thr 
255 



Ser Asn lie Ala 
260 



'Asn Arg 
265 



Asn Asn Met Ala 
270 



GCC ATG GAT GAC GAT GTT CCT 

Ala Met Asp Asp Asp Val Pro 
275 

TAT GCT ATT TAGTAACTCG AG 
Tyr Ala lie 



ATG ACA 
Met Thrj 



AGC TTT GGA TGT GGA ACSC5 



Gin Ser Phe 
280 



Gly Cys Gly Ser 
285 



886 



(2) INFORMATION FOR SEQ ID NO: 4:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 289 amino acids
        (B) TYPE: amino acid
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Ile Arg Phe Leu Val Leu Ser Leu Leu Ile Leu Thr Leu Phe Leu
1 5 10 15

Thr Thr Pro Ala Val Glu Gly Asp Val Ser Phe Arg Leu Ser Gly Ala
20 25 30

Thr Ser Ser Ser Tyr Gly Val Phe Ile Ser Asn Leu Arg Lys Ala Leu
35 40 45

Pro Asn Glu Arg Lys Leu Tyr Asp Ile Pro Leu Leu Arg Ser Ser Leu
50 55 60

Pro Gly Ser Gln Arg Tyr Ala Leu Ile His Leu Thr Asn Tyr Ala Asp
65 70 75 80

Glu Thr Ile Ser Val Ala Ile Asp Val Thr Asn Val Tyr Ile Met Gly
85 90 95

Tyr Arg Ala Gly Asp Thr Ser Tyr Phe Phe Asn Glu Ala Ser Ala Thr
100 105 110

Glu Ala Ala Lys Tyr Val Phe Lys Asp Ala Met Arg Lys Val Thr Leu
115 120 125

Pro Tyr Ser Gly Asn Tyr Glu Arg Leu Gln Thr Ala Ala Gly Lys Ile
130 135 140

Arg Glu Asn Ile Pro Leu Gly Leu Pro Ala Leu Asp Ser Ala Ile Thr
145 150 155 160

Thr Leu Phe Tyr Tyr Asn Ala Asn Ser Ala Ala Ser Ala Leu Met Val
165 170 175

Leu Ile Gln Ser Thr Ser Glu Ala Ala Arg Tyr Lys Phe Ile Glu Gln
180 185 190

Gln Ile Gly Lys Arg Val Asp Lys Thr Phe Leu Pro Ser Leu Ala Ile
195 200 205

Ile Ser Leu Glu Asn Ser Trp Ser Ala Leu Ser Lys Gln Ile Gln Ile
210 215 220

Ala Ser Thr Asn Asn Gly Gln Phe Glu Thr Pro Val Val Leu Ile Asn
225 230 235 240

Ala Gln Asn Gln Arg Val Met Ile Thr Asn Val Asp Ala Gly Val Val
245 250 255

Thr Ser Asn Ile Ala Leu Leu Leu Asn Arg Asn Asn Met Ala Ala Met
260 265 270

Asp Asp Asp Val Pro Met Thr Gln Ser Phe Gly Cys Gly Ser Tyr Ala
275 280 285

Ile



(2) INFORMATION FOR SEQ ID NO: 5:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 1452 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

    (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE: NO

    (vi) ORIGINAL SOURCE:
        (A) ORGANISM: Oryza sativa

    (vii) IMMEDIATE SOURCE:
        (B) CLONE: alpha-amylase

    (ix) FEATURE:
        (A) NAME/KEY: CDS
        (B) LOCATION: 12..1316

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

CCTCGAGGTG C ATG CAG GTG CTG AAC ACC ATG GTG AAC A CAC TTC TTG 50 

Met Gin ValVLeu Asn Thr Met Val Asn Lys His Phe Leu 
1/5 10 



TCC CTT TCG GTC CTC 



GTC CTC CTT GGC CTC TCC TCC AAC TTG ACi98 



Ser Leu Ser Val Leu lie Val Leu Leu Gly Leu Ser Ser Asn Leu Thr 

15 / 20 25 

GCC GGG CAA GTC CTG/TTT CAG GGA TTC AAC TGG GAG TCG TGG AAG GAKSS 

Ala Gly Gin Val LejS Phe Gin Gly Phe Asn Trp Glu Ser Trp Lys Glu 
30 / 35 40 45 



AAT GGC GGG TGG 
Asn Gly Gly Trp, 



AAC TTC CTG ATG GGC AAG GTG GAC GAC ATC GCK94 

Asn Phe Leu Met Gly Lys Val Asp Asp lie Ala 
50 55 60 



GCA GCC GGC ATC ACC CAC GTC TGG CTC CCT CCG CCG TCT CAC TCT G'ISCA2 

Ala Ala Gly lie Thr His Val Trp Leu Pro Pro Pro Ser His Ser Val 
/es 70 75 

GGC GAG CAA fcSGC TAC ATG CCT GGG CGG CTG TAG GAT CTG GAC GCG TC2B0 

Gly Glu Gln^Gly Tyr Met Pro Gly Arg Leu Tyr Asp Leu Asp Ala Ser 

85 90 

AAG TAC GGfc AAC GAG GCG CAG CTC AAG TCG CTG ATC GAG GCG TTC CfflBS 



Lys Tyr Gly Asn Glu Ala Gln Leu Lys Ser Leu Ile Glu Ala Phe His
95 100 105
GGC AAG GGC GTC CAG GTG
Gly Val 



Gly Lys 
110 



Gin Val 
115 



GCG GAG 
Ala Glu 



CAC AAG GAC GGC 

His Lys Asp Gly 
130 



ATC GCC GAC ATC GtC 

lie Ala Asp tie ifal 
'l20 

CGC GGC ATC T^C TGC 

Arg Gly lie /tyr Cys 
'l35 



ATC AAC 
lie Asn 



CAC CGC 
His Arg 



CTC TTC GAG GGC 

Leu Phe Glu Gly 
140 



ACSS6 

Thr 
125 

GCa34 

Gly 



ACG CCC GAC TCC 

Thr Pro Asp Ser 
145 

GAC CCC TAC GGC 

Asp Pro Tyr Gly 
160 

GCC GCC GCG CCG 

Ala Ala Ala Pro 
175 



CGC CTC 
Arg Leu 



GAC TGG 
Asp Trp 



ly Pro His 
'150 



ATG ATC 
Met lie 



TGC CGC GiaS2 

Cys Arg Asp 
155 



CAT GGC ACC GQ 

Asp Gly Thr aly 
165 

GAC ATC GAC CAC 

Asp lie A/sp His 
L80 



AAC CCG GAC ACC GGC 

Asn Pro Asp Thr Gly 
170 

CTC AAC AAG CGC GTC 

Leu Asn Lys Arg Val 
185 



GCC GAC 
Ala Asp 



T'HSO 
Phe 



CTC ATT GGC TGG CTC GAi 
Gly Trp 



Leu lie 
190 



TGG CGC 
Trp Arg 



Leu A^p 

5 



CTC GAC TTC /GCC 

Leu Asp Ph(fe Ala 
0 



TAC ATC GAC GCC 

Tyr lie Asp Ale 
22i 



,CC GAG 
Thr Glu 



TGG CTC AAG ATG GAC 

Trp Leu Lys Met Asp 
200 

AAG GGC TAC TCC GCC 

Lys Gly Tyr Ser Ala 
215 

CCG AGC TTC GCC GTG 

Pro Ser Phe Ala Val 
230 



ATC GGC 
lie Gly 

GAC ATG 
Asp Met 



CAG CGG GJB78 
Gin Arg Glu 

TTC GAC GCS26 

Phe Asp Ala 
205 

GCA AAC ATa74 

Ala Lys lie 
220 



CCC GAG ATA TCG AOS 2 

Ala Glu lie Trp Thr 
235 



TCC ATG 
Ser Met 



Ala [Hen 
240 



CAC CGG CACJ GAG 

His Arg GVn Glu 
255 



GGC GGG GAC GGC 

Gly Gly Asp Gly 
245 

CTG GTC AAC TGG 

Leu Val Asn Trp 
260 



AAG CCG AAC TAC GAC 

Lys Pro Asn Tyr Asp 
250 

GTC GAT CGT GTC GGC 

Val Asp Arg Val Gly 
265 



CAG AAC 
Gin Asn 



GCXSO 
Ala 



GGC GCC AJSaS 
Gly Ala Asn 



ACC AAC OGC ACG GCG TTC GAC TTC ACC ACC AAG GGC ATC CTC AAC G'B£6 



Ser Asn Gly Thr Ala Phe Asp Phe Thr Thr Lys Gly Ile Leu Asn Val
270 275 280 285

GCC GTG GAG GGC GAG CTG TGG CGC CTC CGQ' GGC GAG GAG GGC AAG G(Sa.4 

Ala Val 



Glu Gly Glu 
290 



Leu Trp Arg Leu Ifcq Gly Glu Asp Gly Lys Ala 
95 300 



CGC GGC ATG ATC GGG TGC TGG COG GC^ AAG GOG ACG ACC TTC GTC G7Sa62 
Pro Gly 



Met lie Gly 
305 



Trp Trp Pro /(la Lys Ala Thr Thr Phe Val Asp 
10 315 



AAC CAC GAG ACC GGC TCG ACG C^G CAC CTG TGG CCG TTC CCC TCC QTOCIO 
Asn His 



Asp Thr Gly 
320 



Ser Thr yGln His Leu Trp Pro Phe Pro Ser Asp 
325 330 



AAG GTC ATG CAG GGC TAG GjCA TAG ATC CTC ACC CAC CCC GGC AAC 00368 
Met Gin Gly 



Lys Val 
335 



Tyr ^la Tyr lie Leu Thr His Pro Gly Asn Pro 
340 345 



TGC ATC TTG TAC GAG Cj(T TTC TTC GAT TGG GGT CTC AAG GAG GAG AtDOOe 
Phe Tyr Asp 



Cys lie 
350 



lis Phe Phe Asp Trp Gly Leu Lys Glu Glu lie 
'355 360 365 



GAG CGC CTG GTG TCJPl ATC AGA AAC CGG CAG GGG ATC CAC CCG GCG fllt3C54 
Glu Arg 



Leu Val ser 
J70 



lie Arg Asn Arg Gin Gly lie His Pro Ala Ser 
375 380 



GAG CTG CGC AT<7 ATG GAA GCT GAG AGC GAT CTC TAC CTC GCG GAG 73321)2 
Glu Leu 



Arg IJte Met 
J5 



Glu Ala Asp Ser Asp Leu Tyr Leu Ala Glu lie 
390 395 



GAT GGC AAG /GTG ATC AC A AAG ATT GGA CCA AGA TAC GAC GTC GAA CJfflSO 
Asp Gly 



Lyy Val lie 
40/ 



Thr Lys lie Gly Pro Arg Tyr Asp Val Glu His 
405 410 



CTC ATC C(pC GAA GGC TTC CAG GTC GTC GCG CAC GGT GAT GGC TAC (3398 
ro Glu Gly 



Leu lie 
415 



Phe Gin Val Val Ala His Gly Asp Gly Tyr Ala 
420 425 



ATC TGG /GAG AAA ATC TGAGCGCACG ATGACGAGAC TCTCAGTTTA GCAGATT'n?S63 
Glu Lys Lie 



lie Tri 
430 



435 



CCTGCGATTT TTACCCTGAC CGGTATACGT ATATACGTGC CGGCAACGAG CTGTATCCGA 1413

TCCGAATTAC GGATGCAATT GTCCACGAAG TCCTCGAGG 1452



(2) INFORMATION FOR SEQ ID NO: 6:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 434 amino acids
        (B) TYPE: amino acid
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Gln Val Leu Asn Thr Met Val Asn Lys His Phe Leu Ser Leu Ser
1 5 10 15

Val Leu Ile Val Leu Leu Gly Leu Ser Ser Asn Leu Thr Ala Gly Gln
20 25 30

Val Leu Phe Gln Gly Phe Asn Trp Glu Ser Trp Lys Glu Asn Gly Gly
35 40 45

Trp Tyr Asn Phe Leu Met Gly Lys Val Asp Asp Ile Ala Ala Ala Gly
50 55 60

Ile Thr His Val Trp Leu Pro Pro Pro Ser His Ser Val Gly Glu Gln
65 70 75 80

Gly Tyr Met Pro Gly Arg Leu Tyr Asp Leu Asp Ala Ser Lys Tyr Gly
85 90 95

Asn Glu Ala Gln Leu Lys Ser Leu Ile Glu Ala Phe His Gly Lys Gly
100 105 110

Val Gln Val Ile Ala Asp Ile Val Ile Asn His Arg Thr Ala Glu His
115 120 125

Lys Asp Gly Arg Gly Ile Tyr Cys Leu Phe Glu Gly Gly Thr Pro Asp
130 135 140

Ser Arg Leu Asp Trp Gly Pro His Met Ile Cys Arg Asp Asp Pro Tyr
145 150 155 160

Gly Asp Gly Thr Gly Asn Pro Asp Thr Gly Ala Asp Phe Ala Ala Ala
165 170 175

Pro Asp Ile Asp His Leu Asn Lys Arg Val Gln Arg Glu Leu Ile Gly
180 185 190

Trp Leu Asp Trp Leu Lys Met Asp Ile Gly Phe Asp Ala Trp Arg Leu
195 200 205

Asp Phe Ala Lys Gly Tyr Ser Ala Asp Met Ala Lys Ile Tyr Ile Asp
210 215 220

Ala Thr Glu Pro Ser Phe Ala Val Ala Glu Ile Trp Thr Ser Met Ala
225 230 235 240

Asn Gly Gly Asp Gly Lys Pro Asn Tyr Asp Gln Asn Ala His Arg Gln
245 250 255

Glu Leu Val Asn Trp Val Asp Arg Val Gly Gly Ala Asn Ser Asn Gly
260 265 270

Thr Ala Phe Asp Phe Thr Thr Lys Gly Ile Leu Asn Val Ala Val Glu
275 280 285

Gly Glu Leu Trp Arg Leu Arg Gly Glu Asp Gly Lys Ala Pro Gly Met
290 295 300

Ile Gly Trp Trp Pro Ala Lys Ala Thr Thr Phe Val Asp Asn His Asp
305 310 315 320

Thr Gly Ser Thr Gln His Leu Trp Pro Phe Pro Ser Asp Lys Val Met
325 330 335

Gln Gly Tyr Ala Tyr Ile Leu Thr His Pro Gly Asn Pro Cys Ile Phe
340 345 350

Tyr Asp His Phe Phe Asp Trp Gly Leu Lys Glu Glu Ile Glu Arg Leu
355 360 365

Val Ser Ile Arg Asn Arg Gln Gly Ile His Pro Ala Ser Glu Leu Arg
370 375 380

Ile Met Glu Ala Asp Ser Asp Leu Tyr Leu Ala Glu Ile Asp Gly Lys
385 390 395 400

Val Ile Thr Lys Ile Gly Pro Arg Tyr Asp Val Glu His Leu Ile Pro
405 410 415

Glu Gly Phe Gln Val Val Ala His Gly Asp Gly Tyr Ala Ile Trp Glu
420 425 430

Lys Ile

(2) INFORMATION FOR SEQ ID NO: 7:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 709 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA to mRNA

    (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE: NO

    (vi) ORIGINAL SOURCE:
        (A) ORGANISM: Homo sapiens

    (vii) IMMEDIATE SOURCE:
        (B) CLONE: alpha-hemoglobin

    (ix) FEATURE:
        (A) NAME/KEY: transit_peptide
        (B) LOCATION: 26..241

    (ix) FEATURE:
        (A) NAME/KEY: CDS
        (B) LOCATION: 245..670

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

CTCGAGGGGA TCTGATCTTT CAAGAATGGC ACAAATTAAC AACATGGCAC AAGGGATACA 60

AACCCTTAAT CCCAATTCCA ATTTCCATAA ACCCCAAGTT CCTAAATCTT CAAGTTTTCT 120

TGTTTTTGGA TCTAAAAAAC TGAAAAATTC AGCAAATTCT ATGTTGGTTT TGAAAAAAGA 180

TTCAATTTTT ATGCAAAAGT TTTGTTCCTT TAGGATTTCA GCAGGTGGTA GAGTTTCTTG 240



C^TG GTG CTG TCT CCT GCC GAC AAG ACC AAC GTC AAG GCC GCC TGG GGC 

289 

Val Leu Ser Pro Ala Asp Lys Thr Asn Val Lys Ala Ala Trp Cly 
15 10 15 

AAG GTT GGC GCG CAC GOT GGC GAG TAT GGT GCG GAG GCC CTG GAG A(237 

Lys Val Gly Ala His Ala Gly Glu Tyr Gly Ala Glu Ala Leu Glu Arg 
20 25 30
ATG TTC CTG TCC TTC CCC ACC ACC AAG
Met Phe Leu 



Ser Phe Pro Thr 
35 



CTG AGC CAC GGC TCT GCC CAG 

Leu Ser His Gly Ser Ala Gin 
50 

GAC GCG CTG ACC AAC GCC GTG 
Thr Asn 



Thr Lys 
40 



Gly Lys 
60 



Asp Ala Leu 
65 



CTG TCC GCC CTG AGC 

Leu Ser Ala Leu Ser 
80 

GTC AAC TTC AAG CTC 
Val Asn Phe 



His Lys 
90 

CAC TGC CTG CTG 

His Cys Leu Leu 
105 



GAC ATG CCC AAC GQffil 

Asp Met Pro Asn Ala 
75 

CAC AAG CTT CGG GTG GAC CCS29 

Leu Arg 







CAC TTC GJaSS 

His Phe Asp 
45 

AAG GTG GQ)33 

Lys Val Ala 



Val Asp Pro 
95 



CAC CTC CCC 
His Leu Pro 



TTC CTG GCT 
TAAGCTGGAG 

Phe Leu Ala 
130 



CCT GCG GTG CAC 

Pro Ala Val His 
120 



GTG ACC CTG GCC GCS:77 

Val Thr Leu Ala Ala 
110 

GCC TCC CTG GAC AfS2 5 

Ala Ser Leu Asp Lys 
125 



AGC ACC GTG CTG ACC TCC AAA TAC CGT 



677 



Val Ser Thr 



Val Leu 
135 



Thr Ser 



Lys Tyr 
140 



Arg 



CCTCGGTAGC CGTTCCTCCT GCCCGGTCGA CC 



(2) INFORMATION FOR SEQ ID NO: 8:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 141 amino acids
        (B) TYPE: amino acid
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Val Leu Ser Pro Ala Asp Lys Thr Asn Val Lys Ala Ala Trp Gly Lys
1 5 10 15

Val Gly Ala His Ala Gly Glu Tyr Gly Ala Glu Ala Leu Glu Arg Met
20 25 30

Phe Leu Ser Phe Pro Thr Thr Lys Thr Tyr Phe Pro His Phe Asp Leu
35 40 45

Ser His Gly Ser Ala Gln Val Lys Gly His Gly Lys Lys Val Ala Asp
50 55 60

Ala Leu Thr Asn Ala Val Ala His Val Asp Asp Met Pro Asn Ala Leu
65 70 75 80

Ser Ala Leu Ser Asp Leu His Ala His Lys Leu Arg Val Asp Pro Val
85 90 95

Asn Phe Lys Leu Leu Ser His Cys Leu Leu Val Thr Leu Ala Ala His
100 105 110

Leu Pro Ala Glu Phe Thr Pro Ala Val His Ala Ser Leu Asp Lys Phe
115 120 125

Leu Ala Ser Val Ser Thr Val Leu Thr Ser Lys Tyr Arg
130 135 140



(2) INFORMATION FOR SEQ ID NO: 9:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 743 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA to mRNA

    (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE: NO

    (vi) ORIGINAL SOURCE:
        (A) ORGANISM: Homo sapiens

    (vii) IMMEDIATE SOURCE:
        (B) CLONE: beta-hemoglobin

    (ix) FEATURE:
        (A) NAME/KEY: transit_peptide
        (B) LOCATION: 26..241

    (ix) FEATURE:
        (A) NAME/KEY: CDS
        (B) LOCATION: 245..685

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:



CTCGAGGGGA TCTGATCTTT CAAGAATGGC ACAAATTAAC AACATGGCAC AAGGGATACA 60

AACCCTTAAT CCCAATTCCA ATTTCCATAA ACCCCAAGTT CCTAAATCTT CAAGTTTTCT 120

TGTTTTTGGA TCTAAAAAAC TGAAAAATTC AGCAAATTCT ATGTTGGTTT TGAAAAAAGA 180

TTCAATTTTT ATGCAAAAGT TTTGTTCCTT TAGGATTTCA GCAGGTGGTA GAGTTTCTTG 240



GATG GTG CAC CTG ACT CCT GPfG GAG AAG TCT GCC GTT ACT GCC CTG TGG 

289 

Val His Leu Thr Pro Glu Glu Lys Ser Ala Val Thr Ala Leu Trp
1 5 10 15

GGC AAG GTG AAC GTG GAT /GAA GTT GGT GGT GAG GCC CTG GGC AGG CSS? 

Gly Lys Val Asn Val Asp Glu Val Gly Gly Glu Ala Leu Gly Arg Leu
20 25 30

CTG GTG GTC TAG CCT TyfeG ACC CAG AGG TTC TTT GAG TCC TTT GGG GfflBS 

Leu Val Val Tyr Pro Trp Thr Gln Arg Phe Phe Glu Ser Phe Gly Asp
35 40 45

CTG TCC ACT CCT G^^ GCT GTT ATG GGC AAC CCT AAG GTG AAG GCT CPSS33 

Leu Ser Thr Pro Asp Ala Val Met Gly Asn Pro Lys Val Lys Ala His
50 55 60

GGC AAG AAA GTG/ CTG GGT GCC TTT AGT GAT GGC CTG GCT CAC CTG GMSl 

Gly Lys Lys Val Leu Gly Ala Phe Ser Asp Gly Leu Ala His Leu Asp 
65 70 75

AAC CTC AAG CfGC ACC TTT GCC ACCA CTG AGT GAG CTG CAC TGT GAC AAG 

529 

Asn Leu Lys Gly Thr Phe Ala Thr Leu Ser Glu Leu His Cys Asp Lys
80 85 90 95



CTG CAC GTG GAT CCT GAG AGC TTC AGG CTQ'CTA GGC AAC GTG CTG 6^577

Leu His Val Asp Pro Glu Ser Phe Arg Leu Leu Gly Asn Val Leu Val
100 105 110

TGT GTG CTG GCG CAT CAC TTT GGC AA^ GAA TTC ACC CCA CCA GTG 0825 

Cys Val Leu Ala His His Phe Gly Lys Glu Phe Thr Pro Pro Val Gln
115 120 125

GCT 6CC TAT CAG AAA GTG GTG GOt GGT GTG GCT AAT GCC CTG GCC CPByS 

Ala Ala Tyr Gln Lys Val Val Ala Gly Val Ala Asn Ala Leu Ala His
130 135 140



AAG TAT CAC TAAGCTCGCT TTCTTGCTCT CCAATTTCTA TTAAAGGTTC 



722 



Lys Tyr His 
145 



CTTTGTGGGG TCGAGGTCGA 



743 



(2) INFORMATION FOR SEQ ID NO: 10:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 146 amino acids
        (B) TYPE: amino acid
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Val His Leu Thr Pro Glu Glu Lys Ser Ala Val Thr Ala Leu Trp Gly
1 5 10 15

Lys Val Asn Val Asp Glu Val Gly Gly Glu Ala Leu Gly Arg Leu Leu
20 25 30

Val Val Tyr Pro Trp Thr Gln Arg Phe Phe Glu Ser Phe Gly Asp Leu
35 40 45

Ser Thr Pro Asp Ala Val Met Gly Asn Pro Lys Val Lys Ala His Gly
50 55 60

Lys Lys Val Leu Gly Ala Phe Ser Asp Gly Leu Ala His Leu Asp Asn
65 70 75 80

Leu Lys Gly Thr Phe Ala Thr Leu Ser Glu Leu His Cys Asp Lys Leu
85 90 95

His Val Asp Pro Glu Ser Phe Arg Leu Leu Gly Asn Val Leu Val Cys
100 105 110

Val Leu Ala His His Phe Gly Lys Glu Phe Thr Pro Pro Val Gln Ala
115 120 125

Ala Tyr Gln Lys Val Val Ala Gly Val Ala Asn Ala Leu Ala His Lys
130 135 140

Tyr His
145



(2) INFORMATION FOR SEQ ID NO: 11:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 17 amino acids
        (B) TYPE: amino acid
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: peptide

    (v) FRAGMENT TYPE: N-terminal

    (vi) ORIGINAL SOURCE:
        (A) ORGANISM: alkalophilic Bacillus sp.
        (B) STRAIN: 38-2

    (vii) IMMEDIATE SOURCE:
        (B) CLONE: beta-cyclodextrin

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Ala Pro Asp Thr Ser Val Ser Asn Lys Gln Asn Phe Ser Thr Asp Val
1 5 10 15

Ile




